Analyzing Hogwild Parallel Gaussian Gibbs Sampling

Abstract

Scaling probabilistic inference algorithms to large datasets and parallel computing architectures is a challenge of great importance and considerable current research interest, and great strides have been made in designing parallelizable algorithms. Along with the powerful and sometimes complex new algorithms, a very simple strategy has proven to be surprisingly useful in some situations: running local Gibbs sampling updates on multiple processors in parallel while only periodically communicating updated statistics (see Section 7.4 for details). We refer to this strategy as “Hogwild Gibbs sampling” in reference to recent work [84] in which sequential computations for computing gradient steps were applied in parallel (without global coordination) to great beneficial effect. This Hogwild Gibbs sampling strategy is not new; indeed, Gonzalez et al. [42] attribute a version of it to the original Gibbs sampling paper (see Section 7.2 for a discussion), though it has mainly been used as a heuristic method or initialization procedure without theoretical analysis or guarantees. However, extensive empirical work on Approximate Distributed Latent Dirichlet Allocation (AD-LDA) [83, 82, 73, 7, 55], which applies the strategy to generate samples from a collapsed LDA model [12], has demonstrated its effectiveness in sampling LDA models with the same or better predictive performance as those generated by standard serial Gibbs [83, Figure 3]. These results are empirical, so it is difficult to understand how model properties and algorithm parameters might affect performance, or whether similar success can be expected for other models. There have been recent advances in understanding some of the particular structure of AD-LDA [55], but a thorough theoretical explanation for the effectiveness and limitations of Hogwild Gibbs sampling is far from complete. Sampling-based inference algorithms for complex Bayesian models have notoriously resisted theoretical analysis, so to begin an analysis of Hogwild Gibbs sampling we consider a restricted class of models that is especially tractable for analysis: Gaussians.
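
To make the strategy concrete, below is a minimal serial simulation of Hogwild Gibbs for a Gaussian target N(J^{-1} h, J^{-1}) parameterized by a precision matrix J and potential vector h. In each epoch, every block of coordinates runs a few local Gibbs sweeps against a frozen start-of-epoch copy of the other blocks, mimicking processors that exchange state only at synchronization points. This is an illustrative sketch under those assumptions, not the implementation analyzed in the text; blocks, local_sweeps, and n_epochs are hypothetical parameter names.

import numpy as np

def hogwild_gaussian_gibbs(J, h, blocks, n_epochs=1000, local_sweeps=2, rng=None):
    # Hedged sketch: serially simulate Hogwild Gibbs for N(J^{-1} h, J^{-1}).
    # Each block updates its own coordinates by Gibbs sweeps while reading a
    # stale (start-of-epoch) copy of the other blocks' values.
    rng = np.random.default_rng() if rng is None else rng
    x = np.zeros(J.shape[0])
    samples = []
    for _ in range(n_epochs):
        stale = x.copy()              # state broadcast at the start of the epoch
        new_x = x.copy()
        for block in blocks:          # conceptually parallel across processors
            local = stale.copy()
            for _ in range(local_sweeps):
                for i in block:
                    # conditional of x_i given all other coordinates under J
                    cond_mean = (h[i] - J[i] @ local + J[i, i] * local[i]) / J[i, i]
                    local[i] = cond_mean + rng.standard_normal() / np.sqrt(J[i, i])
            new_x[block] = local[block]   # keep only this block's coordinates
        x = new_x
        samples.append(x.copy())
    return np.array(samples)

With a single block containing every coordinate, each epoch reduces to ordinary sequential Gibbs sweeps; with many small blocks, the stale cross-block reads are exactly the lack of global coordination the abstract describes.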


Similar Articles

Analyzing Hogwild Parallel Gaussian Gibbs Sampling

Sampling inference methods are computationally difficult to scale for many models in part because global dependencies can reduce opportunities for parallel computation. Without strict conditional independence structure among variables, standard Gibbs sampling theory requires sample updates to be performed sequentially, even if dependence between most variables is not strong. Empirical work has ...
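
The sequential requirement referred to here is visible in the standard systematic scan for a Gaussian with precision matrix J (the same N(J^{-1} h, J^{-1}) parameterization as in the sketch above): each coordinate's conditional reads the latest values of all other coordinates, so the updates cannot be reordered or run concurrently without changing the sampled distribution. A minimal sketch:

import numpy as np

def sequential_gibbs_sweep(J, h, x, rng):
    # One in-place systematic-scan Gibbs sweep for N(J^{-1} h, J^{-1}).
    # Coordinate i reads the *current* values of the other coordinates,
    # which is the data dependence that blocks naive parallelization.
    for i in range(len(x)):
        cond_mean = (h[i] - J[i] @ x + J[i, i] * x[i]) / J[i, i]
        x[i] = cond_mean + rng.standard_normal() / np.sqrt(J[i, i])
    return x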


Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling

We propose a generalized Gibbs sampler algorithm for obtaining samples approximately distributed from a high-dimensional Gaussian distribution. Similarly to Hogwild methods, our approach does not target the original Gaussian distribution of interest, but an approximation to it. Contrary to Hogwild methods, a single parameter allows us to trade bias for variance. We show empirically that our met...


Bayesian time series models and scalable inference

With large and growing datasets and complex models, there is an increasing need for scalable Bayesian inference. We describe two lines of work to address this need. In the first part, we develop new algorithms for inference in hierarchical Bayesian time series models based on the hidden Markov model (HMM), hidden semi-Markov model (HSMM), and their Bayesian nonparametric extensions. The HMM is ...


Exact Hamiltonian Monte Carlo for Truncated Multivariate Gaussians

We present a Hamiltonian Monte Carlo algorithm to sample from multivariate Gaussian distributions in which the target space is constrained by linear and quadratic inequalities or products thereof. The Hamiltonian equations of motion can be integrated exactly and there are no parameters to tune. The algorithm mixes faster and is more efficient than Gibbs sampling. The runtime depends on the numb...
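
As an illustration of the mechanics this summary describes, here is a hedged sketch for a standard Gaussian N(0, I) constrained to {x : F x + g >= 0}; a general N(mu, Sigma) target can be whitened first. The trajectory x(t) = x cos t + v sin t solves the Hamiltonian dynamics exactly, and the velocity's normal component is reflected whenever a constraint wall is reached. The names F, g, T, and x0 are illustrative, not from the paper, and x0 must be strictly feasible.

import numpy as np

def exact_hmc_truncated_gaussian(F, g, x0, n_samples=500, T=np.pi / 2, rng=None):
    # Hedged sketch: exact-trajectory HMC for x ~ N(0, I) with F x + g >= 0.
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(n_samples):
        v = rng.standard_normal(x.shape)   # fresh momentum each iteration
        t_left = T
        while True:
            # wall j is reached when a_j cos t + b_j sin t = -g_j
            a, b = F @ x, F @ v
            r, phi = np.hypot(a, b), np.arctan2(b, a)
            t_hit, j_hit = np.inf, -1
            for j in range(len(g)):
                if r[j] < abs(g[j]):
                    continue              # this wall is never reached
                u = np.arccos(np.clip(-g[j] / r[j], -1.0, 1.0))
                for t in ((phi[j] + u) % (2 * np.pi), (phi[j] - u) % (2 * np.pi)):
                    if 1e-10 < t < t_hit:
                        t_hit, j_hit = t, j
            t_step = min(t_hit, t_left)
            x, v = (x * np.cos(t_step) + v * np.sin(t_step),
                    -x * np.sin(t_step) + v * np.cos(t_step))
            if t_hit >= t_left:
                break                     # trajectory finished without a hit
            f = F[j_hit]
            v -= 2 * (f @ v) / (f @ f) * f   # reflect velocity off the wall
            t_left -= t_step
        samples.append(x.copy())
    return np.array(samples)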


Gibbs sampling for fitting finite and infinite Gaussian mixture models

This document gives a high-level summary of the necessary details for implementing collapsed Gibbs sampling for fitting Gaussian mixture models (GMMs) following a Bayesian approach. The document structure is as follows. After notation and reference sections (Sections 2 and 3), the case for sampling the parameters of a finite Gaussian mixture model is described in Section 4. This is then extende...
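
As a compact illustration of the collapsed update such a scheme uses, the sketch below specializes (an assumption made for brevity, not taken from the document) to a 1-D finite GMM with known shared component variance sigma2, a conjugate N(mu0, sigma02) prior on each component mean that is integrated out, and a symmetric Dirichlet(alpha) prior on the mixture weights; only the assignments z are sampled.

import numpy as np

def collapsed_gibbs_gmm_1d(x, K, alpha=1.0, sigma2=1.0, mu0=0.0, sigma02=10.0,
                           n_iters=100, rng=None):
    # Hedged sketch: collapsed Gibbs over assignments z for a 1-D finite GMM
    # with component means integrated out under a conjugate normal prior.
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    z = rng.integers(K, size=n)
    counts = np.bincount(z, minlength=K).astype(float)
    sums = np.array([x[z == k].sum() for k in range(K)])
    for _ in range(n_iters):
        for i in range(n):
            counts[z[i]] -= 1             # remove point i from its component
            sums[z[i]] -= x[i]
            # posterior over each component mean given the remaining points
            lam = 1.0 / sigma02 + counts / sigma2
            m = (mu0 / sigma02 + sums / sigma2) / lam
            # predictive density of x[i] times the collapsed Dirichlet weight
            pred_var = sigma2 + 1.0 / lam
            log_p = (np.log(counts + alpha)
                     - 0.5 * np.log(2 * np.pi * pred_var)
                     - 0.5 * (x[i] - m) ** 2 / pred_var)
            p = np.exp(log_p - log_p.max())
            z[i] = rng.choice(K, p=p / p.sum())
            counts[z[i]] += 1
            sums[z[i]] += x[i]
    return z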




Publication date: 2016